Spoken Interface for Correcting Phoneme Recognition Errors in Learning of Unknown Words
Abstract
This paper describes a novel method that enables users to teach a system the phoneme sequences of new words through spoken interaction. With this method, users incrementally correct misrecognized phoneme sequences by making corrective utterances, each of which may contain the whole word or only a segment of it. If a correction yields a better phoneme sequence than the previous one, the user can either stop the interaction or make another corrective utterance; otherwise, the user can reject the utterance. The novel aspects of the method are 1) interactive correction by speech, 2) the use of spoken word segments to locate misrecognized phonemes, and 3) the use of generalized posterior probability (GPP) as a measure for correcting misrecognized phonemes. Experimental results show that the proposed method achieved 96.8% phoneme accuracy and 79.1% word accuracy with fewer than seven corrective utterances.
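The correction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a spoken segment is matched against the current phoneme sequence or how GPP is computed. The sketch assumes that the segment is located by approximate substring matching (edit distance with a free start and end), and that per-phoneme confidence scores are supplied by the recognizer as plain numbers standing in for GPP. The names `locate_segment` and `apply_correction` and the mean-confidence acceptance rule are hypothetical.

```python
from typing import List, Tuple


def locate_segment(hyp: List[str], seg: List[str]) -> Tuple[int, int, int]:
    """Find the span hyp[start:end] with the smallest edit distance to seg.

    Assumption: approximate substring matching stands in for whatever
    localization the paper actually uses. Returns (start, end, distance).
    """
    n, m = len(hyp), len(seg)
    # dist[i][j]: fewest edits aligning seg[:i] to some span of hyp ending at j
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    # start[i][j]: where that best span begins in hyp
    start = [[j for j in range(n + 1)] for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i
        start[i][0] = 0
        for j in range(1, n + 1):
            sub = dist[i - 1][j - 1] + (seg[i - 1] != hyp[j - 1])
            ins = dist[i - 1][j] + 1   # segment phoneme absent from hyp
            dele = dist[i][j - 1] + 1  # extra phoneme in hyp
            best = min(sub, ins, dele)
            dist[i][j] = best
            if best == sub:
                start[i][j] = start[i - 1][j - 1]
            elif best == ins:
                start[i][j] = start[i - 1][j]
            else:
                start[i][j] = start[i][j - 1]
    end = min(range(n + 1), key=lambda j: dist[m][j])
    return start[m][end], end, dist[m][end]


def apply_correction(
    hyp: List[str], hyp_conf: List[float],
    seg: List[str], seg_conf: List[float],
) -> Tuple[List[str], List[float]]:
    """Replace the located span only if the segment is more confident.

    Assumption: mean per-phoneme confidence is a stand-in for the GPP-based
    measure; the actual criterion is not given in the abstract.
    """
    if not seg:
        return hyp, hyp_conf
    start, end, _ = locate_segment(hyp, seg)
    old = hyp_conf[start:end]
    old_mean = sum(old) / len(old) if old else 0.0
    new_mean = sum(seg_conf) / len(seg_conf)
    if new_mean <= old_mean:
        return hyp, hyp_conf  # keep the previous sequence
    return (hyp[:start] + seg + hyp[end:],
            hyp_conf[:start] + seg_conf + hyp_conf[end:])


# Illustrative data only: a hypothesis with one misrecognized phoneme ("d").
hyp = ["k", "o", "m", "p", "u", "d", "a"]
hyp_conf = [0.9, 0.9, 0.9, 0.9, 0.8, 0.3, 0.7]
seg = ["p", "u", "t", "a"]          # corrective utterance covering a segment
seg_conf = [0.9, 0.9, 0.8, 0.9]
new_hyp, new_conf = apply_correction(hyp, hyp_conf, seg, seg_conf)
print(" ".join(new_hyp))
```

In an interactive loop, the user would inspect the updated sequence after each call and either stop, speak another corrective utterance, or reject the result and keep the previous sequence.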
Similar resources
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Influences of spoken word planning on speech recognition.
In 4 chronometric experiments, influences of spoken word planning on speech recognition were examined. Participants were shown pictures while hearing a tone or a spoken word presented shortly after picture onset. When a spoken word was presented, participants indicated whether it contained a prespecified phoneme. When the tone was presented, they indicated whether the picture name contained the...
Recurrent Neural Network-Based Phoneme Sequence Estimation Using Multiple ASR Systems' Outputs for Spoken Term Detection
This paper describes a novel correct phoneme sequence estimation method that uses a recurrent neural network (RNN)-based framework for spoken term detection (STD). In an automatic speech recognition (ASR)-based STD framework, ASR performance (word or subword error rate) affects STD performance. Therefore, it is important to reduce ASR errors to obtain good STD results. In this study, we use an ...
Fast Approximate Spoken Term Detection from Sequence of Phonemes
We investigate the detection of spoken terms in conversational speech using phoneme recognition with the objective of achieving smaller index size as well as faster search speed. Speech is processed and indexed as a sequence of one best phoneme sequence. We propose the use of a probabilistic pronunciation model for the search term to compensate for the errors in the recognition of phonemes. Thi...
Evaluation of DNN-based Phoneme Estimation Approach on the NTCIR-12 SpokenQuery&Doc-2 SQ-STD Subtask
This paper proposes a correct phoneme sequence estimation method using a deep neural network (DNN)-based framework for spoken term detection (STD). We use a DNN architecture as a correct phoneme estimator. The DNN-based estimator estimates a correct phoneme sequence of an utterance from some sorts of phoneme-based transcriptions produced by multiple ASR systems in post-processing, for reducing ...
Publication date: 2011